ABSTRACT

Bright sub-mm galaxies are expected to arise in massive highly-biased haloes, and 
hence exhibit strong clustering. We argue that a valuable tool for measuring these 
clustering properties is the cross-correlation of sub-mm galaxies with faint optically- 
selected sources. We analyze populations of SCUBA-detected and optical galaxies in 
the GOODS-N survey area. Using optical/IR photometric-redshift information, we
search for correlations induced by two separate effects: (1) cosmic magnification of 
background sub-mm sources by foreground dark matter haloes traced by optical galax- 
ies at lower redshifts; and (2) galaxy clustering due to sub-mm and optical sources 
tracing the same population of haloes where their redshift distributions overlap. Re- 
garding cosmic magnification, we find no detectable correlation. Our null result is 
consistent with a theoretical model for the cosmic magnification, and we show that a 
dramatic increase in the number of sub-mm sources will be required to measure the 
effect reliably. Regarding clustering, we find evidence at the 3.5σ level for a
cross-correlation between sub-mm and optical galaxies analyzed in identical photometric
redshift slices. The data hint that the sub-mm sources have an enhanced bias parameter
compared to the optically-selected population (with a significance of 2σ). The next
generation of deep sub-mm surveys can potentially perform an accurate measurement 
of each of these cross-correlations, adding a new set of diagnostics for understanding 
the development of massive structure in the Universe. 

Key words: large-scale structure of Universe - cosmology: observations - galaxies: 
starburst - submillimetre 



1 INTRODUCTION 

Deep sub-mm surveys, using the Submillimetre Common 
User Bolometer Array (SCUBA; Holland et al. 1999) at the 
James Clerk Maxwell Telescope, have revolutionized our un- 
derstanding of the high-redshift Universe by discovering a 
new population of distant highly star-forming dusty galax- 
ies (see Blain et al. 2002 and references therein) . Despite the 
roughly 15 arcsec SCUBA beam-size (at 850 μm) and their
typical optical faintness, these sources have gradually been 
optically identified, aided by a combination of deep radio ob- 
servations from the Very Large Array (e.g. Ivison et al. 2002) 
and, more recently, data from the Spitzer Space Telescope. 
Follow-up spectroscopy of the counterparts has confirmed 
that the median redshift of the sub-mm population is high,
z ~ 2 (Chapman et al. 2005).

As the depth and area of sub-mm surveys increase, the
population can be characterized statistically by determining
basic properties such as the luminosity function and the clus- 
tering amplitude. Such measurements will allow the sub-mm 
population to be described within the framework of models 
for the formation of massive galaxies (e.g. Baugh et al. 2005). 
The clustering properties of a class of galaxies, interpreted
in terms of cold dark matter type models, are a key measurement
(see e.g. van Kampen et al. 2005). The bias of a
galaxy population traces the global environment it inhabits, 
and can be linked to a representative mass of dark matter 
halo. Such information reflects upon the formation mecha- 
nism of the population, and allows evolutionary sequences 
to be inferred between objects at high and low redshifts. In 
particular, if sub-mm galaxies originate from galaxy merg- 
ers, then they should be highly biased with respect to the 
underlying dark matter, given that mergers occur in high- 
density environments in the high-redshift Universe (Percival 
et al. 2003). If the bias is found to be high, then this would 
constitute direct evidence that sub-mm galaxies are indeed
the progenitors of today's massive elliptical galaxies. 

Existing surveys of sub-mm galaxies are inadequate for 
accurately performing a direct determination of their clus- 
tering properties via measurement of auto-correlation func- 
tions. So far there have been at best tentative detections of 
such clustering (e.g. Blain et al. 2004). An alternative approach
is to measure instead the cross-correlation function
of the sub-mm population and the dark matter distribution 
traced by more numerous optically-selected galaxies. Such 
an analysis was performed by Almaini et al. (2005), using 
the 8-mJy SCUBA surveys of Scott et al. (2002) and the 
shallower scan map of the Hubble Deep Field (Borys et al. 
2002). A significant cross-correlation was detected between 
these sub-mm data-sets and optical follow-up images. Inter- 
estingly, the measured correlation was between the sub-mm 
sources at high redshift and relatively bright optical galaxies 
at lower redshifts. This led Almaini et al. to suggest that the 
most likely explanation for the cross-correlation was the phenomenon
of cosmic magnification, by which the background
sub-mm galaxies experience gravitational lensing by fore- 
ground dark matter haloes traced by the bright optically- 
selected galaxies. A similar effect has recently been observed 
for the cross-correlation between background quasars and 
foreground galaxies in the Sloan Digital Sky Survey (Scran- 
ton et al. 2005). 

The amplitude of the cosmic magnification is deter- 
mined by several factors, including the dark matter power 
spectrum and growth function, but also the slope of the flux 
distribution of the background population. In the case of 
sub-mm galaxies this slope is exceptionally steep, leading 
to a relatively large cross-correlation. A similar explanation 
was posited to explain an earlier measured correlation be- 
tween sub-mm galaxies and X-ray selected sources (Almaini 
et al. 2003). An alternative possibility was also suggested: 
that a higher-than-expected fraction of sub-mm galaxies lie 
at relatively low redshifts z < 1, such that the two classes of 
objects partially trace the same large-scale structure. 

In this study we analyze the cross-correlation be- 
tween sub-mm and optically-selected galaxies using data- 
sets from the Great Observatories Origins Deep Survey 
North (GOODS-N) region (Giavalisco et al. 2004). There are 
several advantages to using this field: (1) a robust and sub- 
stantial catalogue of sub-mm sources exists, extracted from 
a well-understood compilation of SCUBA data (Borys et al. 
2003, 2004; Pope et al. 2005); (2) there is almost complete 
identification of the SCUBA counterparts, including spec- 
troscopic redshifts for almost half of the objects and photo- 
metric redshift estimates for the rest; and (3) deep Hubble 
Space Telescope (HST) ACS observations have been taken 
of the entire field, providing a high density of optical galax- 
ies with photo-z estimates. The data-sets used are described 
in more detail in Section 2. We pay particular attention to 
the estimator for the cross-correlation function employed, 
as explained in Section 3. Using different photometric red- 
shift cuts, we search for cross-correlations induced by cosmic 
magnification and by galaxy clustering. Our measurements 
are detailed in Section 4, and are compared with theoret- 
ical predictions in Section 5. Prospects for future sub-mm 
surveys are considered in Section 6. 

Figure 1. Noise map for SCUBA observations in the GOODS-N
field. The grey-scale represents the noise level, ranging from a
minimum of 0.3 mJy beam$^{-1}$ (white) to a maximum of 15 mJy
beam$^{-1}$ (black). The circles indicate the positions of the secure
sample of 34 counterparts extracted from the sub-mm map, with
the size of the plotted symbol increasing with the brightness of
the source. We note that the noise at the positions of the extracted
sources ranges from 0.3 to 4.3 mJy beam$^{-1}$. The straight
boundaries illustrate the extent of the optical ACS observations, 
which have almost uniform sensitivity within this region. Three of 
the 34 sub-mm sources, marked with crosses, are excluded from 
our cross-correlation analysis. One lies just outside the area of 
uniform optical sensitivity, while two others are within 30 arcsec 
of higher signal-to-noise sub-mm sources where the object extrac- 
tion is particularly difficult, as discussed in Section 4.1. 



2 DATA-SETS 

The GOODS-N region covers an area of approximately 
10 × 16.5 arcmin, centred on 12h 36m 55s, +62°14'15" (Giavalisco
et al. 2004). All of the SCUBA data from several extensive
imaging campaigns in the GOODS-N field have been
combined into one sub-mm map, which we refer to as the 
'super-map' (see Borys et al. 2003 and references therein). 
The resulting noise map and positions of extracted sources 
are displayed in Fig. 1. Since the super-map is a compila- 
tion of essentially all SCUBA data taken in the field, the 
associated noise map is extremely non-uniform. The most 
recent version of the GOODS-N super-map contains forty 
850-μm sources detected above 3.5σ (Pope et al. 2005).
From Monte Carlo simulations, we expect to find ~ 3 spuri- 
ous sources in the extraction process (Borys et al. 2003). In 
order to refine our secure catalogue, we have explored the 
level of flux boosting (also referred to as Malmquist and/or 
Eddington bias) of these sources by applying the Bayesian 
approach discussed in Coppin et al. (2005). By simulating 
a distribution of pixel values, which depends on the chop- 
ping pattern, we determined the de-boosted flux for each of 
our sources. We then removed any sources from our sample 
which were severely affected by flux-boosting and possessed 
a non-negligible probability of having zero flux, which left 
us with 35 secure sub-mm sources. Note that the decision to
reject sources due to likely flux boosting comes from a sim- 
ple relationship based on signal and noise, and therefore it is 
easy to include the same criterion in our simulated sub-mm 
catalogues (discussed in Section 4.1). 

Using the optical, radio and new Spitzer data in 
GOODS-N we have identified counterparts for all but one of 
the secure sub-mm sources (the details are discussed in Pope 
et al. 2006). The positions of the remaining 34 are plotted in 
Fig. 1. Spectroscopic redshifts are known for about half of 
the sub-mm catalogue; reliable photometric redshifts have 
been estimated for the remainder of the objects using the 
extensive optical and infra-red data (see Figure 1 of Pope et 
al. 2005 and Figure 7 of Pope et al. 2006) . For those sub-mm 
sources possessing both spectroscopic and photometric redshifts,
the standard deviation of $(z_{\rm phot} - z_{\rm spec})/(1 + z_{\rm spec})$ is
0.10.

The optical data for the GOODS-N region (Giavalisco 
et al. 2004), obtained with the Advanced Camera for Surveys 
(ACS) on-board the HST, has a uniform sensitivity within 
the boundaries indicated in Fig. 1, which represents the area 
studied in our cross-correlation analysis. We restricted our 
analysis to sources which are detected above 5σ and therefore,
for all but the faintest magnitudes, we are far enough
above the noise level that the subtle variations in exposure 
time will not affect our results. In addition to the 4 ACS 
bands ($B_{435}$, $V_{606}$, $i_{775}$ and $z_{850}$), the GOODS-N field has
also been surveyed with several ground-based facilities, providing
data in the following bands: U (KPNO, Capak et
al. 2004); B, V, R, I, z (Subaru, Capak et al. 2004); and J,
$K_s$ (KPNO, Giavalisco et al. 2004). Using all of these optical
data, photometric redshifts have been calculated for roughly 
half of the ~ 32,000 optically-detected galaxies in GOODS- 
N. The accuracy of these photometric redshifts has been 
determined using the subset of sources (numbering ~ 1,700) 
which also possess spectroscopic redshifts. The standard deviation
of $(z_{\rm phot} - z_{\rm spec})/(1 + z_{\rm spec})$ for the optical catalogue
is 0.11. We note that throughout this paper we use AB magnitudes.



3 THE CROSS-CORRELATION FUNCTION 

The cross-correlation function $w_{\rm cross}(\theta)$ between two galaxy
populations 1 and 2, in terms of an angular scale $\theta$, is defined
as the fractional excess in the probability $\delta P$, relative to a
random unclustered distribution, of finding both a galaxy of
type 1 in a solid angle element $\delta\Omega_1$ and a galaxy of type 2
in a solid angle element $\delta\Omega_2$, separated by angle $\theta$:

$$\delta P = \Sigma_1 \Sigma_2 \left[ 1 + w_{\rm cross}(\theta) \right] \delta\Omega_1 \, \delta\Omega_2 , \qquad (1)$$

where $\Sigma_1$ and $\Sigma_2$ are the surface densities of populations 1
and 2 (Peebles 1980).

The cross-correlation function $w_{\rm cross}(\theta)$ is measured
from the galaxy distributions by constructing pair counts
from the data-sets. A pair count between two galaxy populations
1 and 2, $D_1D_2(\theta)$, is a binned histogram of the separations
$\theta$ of every galaxy of population 1 relative to all objects
of population 2. In order to determine the cross-correlation
function, fully incorporating the effects of the survey geometry
and of statistical fluctuations, we must also generate
random unclustered realizations of the galaxy distributions
with the same angular selection functions as the real data.
The pair counts between the data and random distributions
(denoted $D_iR_j$) measure the actual average available area
around each object, taking account of the survey window
function and the distributions of the galaxies relative to the
boundaries of the sample. In this way, we can construct correlation
function estimators with the smallest bias and variance
in the angular range under investigation.

The error in a correlation function estimator is determined
from the variance of individual pair counts. It is important
to note that, if a separation bin contains N galaxy
pairs, then the statistical variance in this bin will in general
exceed the 'Poisson error' $\sqrt{N}$, even for an unclustered
distribution of objects, as can be demonstrated by simulations
or analytic calculations (e.g. Landy & Szalay 1993;
Hamilton 1993; Bernstein 1994; Bernardeau et al. 2002). The
increase in variance compared to the Poisson prediction depends
on the survey geometry, but can be considerable for a
sub-optimal estimator when the pair separation $\theta$ is not negligible
compared to the survey dimensions. A fundamental
cause of the excess non-Poisson variance is edge effects: the
position of sources relative to the boundaries of the sample
is important in determining the distribution of pair separations
(i.e. a source distant $\theta_0$ from an edge is less likely
to participate in pairs of separation $\theta > \theta_0$). The true variance
of the correlation function estimator may be measured
by techniques such as jack-knife re-sampling or Monte Carlo
simulations.

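Various estimators for the cross-correlation function
have been proposed. Two commonly used but (in general)
sub-optimal estimators are:

$$w_{\rm cross}(\theta) = \frac{D_1D_2}{D_2R_1} - 1 , \qquad (2)$$

$$\mbox{and} \quad w_{\rm cross}(\theta) = \frac{D_1D_2}{D_1R_2} - 1 . \qquad (3)$$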



These two estimators are potentially biased, because in each 
case random realizations of only one of the two data-sets 
have been created, thus the statistical fluctuations and edge 
effects due to the distribution of sources in the other data- 
set have not been taken into account. Furthermore, equa- 
tions 2 and 3 are not invariant when the indices (1,2) are 
interchanged.† In addition, the variance of these estimators
may significantly exceed the Poisson prediction, depending 
on the survey geometry (e.g. Landy & Szalay 1993). Better 
estimators for the cross-correlation function are: 



$$w_{\rm cross}(\theta) = \frac{D_1D_2 \times R_1R_2}{D_1R_2 \times D_2R_1} - 1 , \qquad (4)$$

$$\mbox{and} \quad w_{\rm cross}(\theta) = \frac{D_1D_2 - D_1R_2 - D_2R_1 + R_1R_2}{R_1R_2} . \qquad (5)$$

These two estimators are modified versions of those origi- 
nally suggested for the auto-correlation function by Hamil- 
ton (1993) and Landy & Szalay (1993), respectively. In each 



† This is a clue hinting that the estimators are biased, since linear
bias terms in one or other data-set may still be present. In general 
a good cross-correlation estimator should not answer one of the 2 
questions 'is data-set 1 correlated with data-set 2' or 'is data-set 
2 correlated with data-set 1', but should answer both. 
case statistical fluctuations in both data-sets have been incorporated,
and the equations are symmetrical in the indices
(1,2). 
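
As an illustration of how the symmetric estimators of equations 4 and 5 combine the pair counts, the following minimal sketch may be helpful. It is not the pipeline used in this work: the function names are hypothetical, positions are assumed to be flat-sky (x, y) coordinates in arcmin, and the random catalogues are assumed to contain the same number of objects as the data, so that no normalization of the pair counts is required.

```python
import math

def pair_counts(cat_a, cat_b, bin_width, n_bins):
    """Binned histogram of separations between every object of cat_a
    and every object of cat_b (flat-sky assumption, units of arcmin)."""
    counts = [0] * n_bins
    for xa, ya in cat_a:
        for xb, yb in cat_b:
            k = int(math.hypot(xa - xb, ya - yb) / bin_width)
            if k < n_bins:
                counts[k] += 1
    return counts

def hamilton_cross(d1d2, d1r2, d2r1, r1r2):
    """Hamilton-type estimator (equation 4), bin by bin.
    Assumes every D1R2 and D2R1 bin is non-empty."""
    return [dd * rr / (a * b) - 1.0
            for dd, a, b, rr in zip(d1d2, d1r2, d2r1, r1r2)]

def landy_szalay_cross(d1d2, d1r2, d2r1, r1r2):
    """Landy-Szalay-type estimator (equation 5), bin by bin.
    Assumes every R1R2 bin is non-empty."""
    return [(dd - a - b + rr) / rr
            for dd, a, b, rr in zip(d1d2, d1r2, d2r1, r1r2)]
```

Both functions are unchanged when the roles of the two populations are swapped, which is the symmetry property discussed above.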



4 CROSS-CORRELATION MEASUREMENTS 
4.1 Generation of random catalogues 

In order to measure the cross-correlation function robustly 
we must generate random (unclustered) comparison data- 
sets for each of our surveys, possessing the same angular 
selection functions as the survey data. For the optical ob- 
servations, we generated random catalogues by uniformly 
populating the region delineated by the straight boundaries 
in Fig. 1. One of the 34 secure sub-mm counterparts lies
outside the area of uniform optical sensitivity, as indicated 
in Fig. 1, and as a result was excluded from our analysis. 

For the sub-mm data-set, we generated random distributions
using the noise map plotted in Fig. 1. Firstly, candidate
sources were randomly generated inside the analysis
region from the flux distribution fitted to this sub-mm data-set
by Borys et al. (2003):

$$N(S) \propto \left[ \left( \frac{S}{S_0} \right)^{\alpha} + \left( \frac{S}{S_0} \right)^{\beta} \right]^{-1} , \qquad (6)$$



where N(S) dS is the number of sources with fluxes between 
S and S + dS. The best-fitting values of the parameters are 
approximately $\alpha = 1$, $\beta = 3.3$, and $S_0 = 1.8$ mJy. This distribution
represents a number-counts slope steepening with
increasing flux, and is a reasonable fit to all existing SCUBA 
data (although the detailed form of this function is not im- 
portant - any function which fits the current data would 
suffice for our purpose). The candidate source was only re- 
tained if its signal-to-noise ratio determined by the noise 
map exceeded 3.5 and if it satisfied the flux-deboosting cri- 
terion described in Section 2. 
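
The first step of this procedure can be sketched as follows, purely for illustration. The sketch assumes a double power-law form for the counts, N(S) proportional to 1 / [(S/S0)^alpha + (S/S0)^beta], and draws fluxes by rejection sampling from a single power-law proposal. The flux limits `s_min` and `s_max` and the function names are hypothetical choices, and the flux-deboosting criterion is not reproduced here.

```python
import random

def draw_flux(rng, s0=1.8, alpha=1.0, beta=3.3, s_min=0.5, s_max=30.0):
    """Draw one flux (mJy) from N(S) ~ 1/[(S/S0)^alpha + (S/S0)^beta].
    s_min and s_max are illustrative limits, not values from the text."""
    lo, hi = s_min / s0, s_max / s0
    while True:
        u = rng.random()
        # Proposal: pure power law s^-alpha on [lo, hi] via inverse CDF.
        if alpha == 1.0:
            s = lo * (hi / lo) ** u
        else:
            p = 1.0 - alpha
            s = (lo ** p + u * (hi ** p - lo ** p)) ** (1.0 / p)
        # Target / proposal = 1 / (1 + s^(beta - alpha)), which is <= 1.
        if rng.random() < 1.0 / (1.0 + s ** (beta - alpha)):
            return s * s0

def is_retained(flux, noise, threshold=3.5):
    """Signal-to-noise cut applied to a candidate at its map position."""
    return flux / noise > threshold
```

In use, `noise` would be read from the noise map at the randomly chosen position of the candidate, so that deeper regions of the map retain more of the faint candidates.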

Secondly, we must take account of the angular resolu- 
tion of the sub-mm observations, otherwise the highly non- 
uniform noise distribution will cause the deepest portions of 
the map to be over-populated by random sources in compar- 
ison to the real data. Although the beam size of the SCUBA 
instrument at 850 μm is about 15 arcsec, the sub-mm catalogue
displays a dearth of pairs separated by less than 30
arcsec (compared to the number of pairs expected by ran- 
dom chance), owing to the difficulty of fitting very close 
pairs of sources in the extraction process. Therefore, we re- 
jected a candidate random source if its putative position was 
closer than 30 arcsec to an existing random source with a 
higher signal-to-noise ratio. If the near neighbour possessed 
a lower signal-to-noise ratio than the candidate object, it 
was expunged from the random catalogue in favour of the 
new source. The final distributions of fluxes in our random 
catalogues were found to agree reasonably well with that ob- 
served for the real sub-mm data-set. The sub-mm catalogue 
plotted in Fig. 1 does in fact contain two pairs separated by 
less than 30 arcsec: for consistency, the source with the lower
signal-to-noise ratio was removed in each case, leaving
us with a total of 31 sub-mm objects. 
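
The close-pair rule described above might be implemented along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, positions are flat-sky coordinates in arcmin, and a candidate with an equal or higher signal-to-noise neighbour is rejected even if it also has fainter neighbours (the text does not specify this case).

```python
import math

def add_candidate(catalogue, candidate, min_sep=0.5):
    """Apply the 30-arcsec (0.5-arcmin) exclusion rule.
    Sources are (x, y, snr) tuples; returns the updated catalogue."""
    x, y, snr = candidate
    near = [s for s in catalogue
            if math.hypot(s[0] - x, s[1] - y) < min_sep]
    # A neighbour with higher (or equal) S/N wins: reject the candidate.
    if any(s[2] >= snr for s in near):
        return list(catalogue)
    # Otherwise the candidate replaces all of its lower-S/N neighbours.
    return [s for s in catalogue if s not in near] + [candidate]
```

Applying the same rule to the real catalogue reproduces the removal of the lower signal-to-noise member of each close pair.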



4.2 Determination of errors 

As discussed in Section 3, the assumption of Poisson errors 
can be a poor approximation for the variance in a correlation 
function measurement. Moreover, this approach cannot es- 
tablish the covariances between separation bins. A common 
technique for improving the error determination is jack-knife 
re-sampling, in which correlation functions are measured for 
many sub-samples of the data-sets in order to estimate the 
statistical fluctuations and covariance (e.g. Scranton et al. 
2002). However, our sub-mm catalogue is too small to allow 
the reliable application of this method. 

We therefore determined the covariance matrix of each 
correlation function measurement using Monte Carlo real- 
izations, in which the actual data-sets were substituted by 
random realizations generated as described in Section 4.1. 
This is an acceptable approximation given that the angu- 
lar clustering of these data-sets is weak compared to the 
shot noise error. Each mock realization of the data was an- 
alyzed by our correlation function pipeline, and the results 
for many realizations enabled the covariance matrix to be 
constructed. 
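
For concreteness, the construction of the covariance matrix from a set of mock measurements can be sketched as below. The function name is hypothetical, and each realization is assumed to be a list holding the correlation function measured in each separation bin.

```python
def covariance_matrix(realizations):
    """Sample mean and covariance (n - 1 normalization) of the
    correlation function over Monte Carlo realizations."""
    n = len(realizations)
    n_bins = len(realizations[0])
    mean = [sum(r[i] for r in realizations) / n for i in range(n_bins)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in realizations)
            / (n - 1)
            for j in range(n_bins)]
           for i in range(n_bins)]
    return mean, cov
```

The square roots of the diagonal elements give the error bar for each bin, while the off-diagonal elements carry the covariances between bins that a Poisson estimate cannot provide.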



4.3 Bias and variance of estimators 

Fig. 2 indicates how the measured cross-correlation function 
depends on the estimator employed, analyzing a test case 
consisting of all 31 sub-mm sources and a bright z-filter magnitude
slice of ACS galaxies ($20 < m_z < 22$). Throughout
this paper we measure the cross-correlation function up to 
an angular scale of 5 arcmin in 10 bins of width 0.5 arcmin. 
The plotted error bars always correspond to the diagonal 
elements of the covariance matrix measured as described in 
Section 4.2. The 'good' estimators for $w_{\rm cross}(\theta)$ (equations
4 and 5) produce mutually consistent results that display 
no evidence for cross-correlation. 'Estimator 1' (equation 
2) displays a strong positive cross-correlation. 'Estimator 
2' (equation 3) is unbiased for these distributions, but pos- 
sesses a significantly increased variance. 

For 'estimator 1', which displays the strong bias in 
Fig. 2, we averaged over random optical data-sets, but not 
random sub-mm catalogues. This is problematic, because 
our actual sub-mm data-set (Fig. 1) happens by chance to 
have more sources in the lower half of the field than the average
realization, accidentally coinciding with an overdensity in
the optical data-set. In terms of pair counts, $D_{\rm sub-mm}D_{\rm opt}$ is
spuriously high in comparison to $D_{\rm sub-mm}R_{\rm opt}$, causing the
positive offset in the cross-correlation function over a range 
of scales. We emphasize that this is purely an artefact of us- 
ing a sub-optimal estimator - these large-scale fluctuations 
are entirely consistent with random realizations of the data- 
sets, but poor estimators can mistakenly assign significance 
to this. 

In Fig. 3 we plot the ratio between the correlation func- 
tion error determined by the Monte Carlo realizations and 
the Poisson error, as a function of scale for the four esti- 
mators. The 'good' estimators of equations 4 and 5 perform 
best in terms of variance as well as bias, approaching the 
Poisson prediction. 'Estimator 2' (equation 3) possesses a 
significantly larger variance than that predicted by Poisson 
statistics. 

For the rest of this paper we chose to use the Landy-Szalay-based
estimator for the cross-correlation function
(equation 5, 'estimator 4'). In all cases we determined the
pair counts $D_1R_2$, $D_2R_1$ and $R_1R_2$ by averaging over 10 random
catalogues, each containing the same number of galaxies
as the real survey data-sets.

Figure 2. The cross-correlation function of the SCUBA and ACS
data-sets measured by a variety of estimators. In this test case we
restricted the ACS sample to galaxies in the z-filter magnitude
range $20 < m_z < 22$. Estimators 1 to 4 correspond to equations
2, 3, 4 and 5, respectively, whilst labels '1' and '2' in these equations
refer to the optical and sub-mm data-sets, respectively. The
errors are determined by Monte Carlo realizations. The separate
measurements are offset along the x-axis for clarity.

Figure 3. The standard deviation in the cross-correlation function
of the SCUBA and ACS data-sets measured by a variety of
estimators, as determined by Monte Carlo realizations. In this test
case we analyzed the same sample of ACS galaxies as in Fig. 2,
and estimators 1 to 4 have the same correspondences. The errors
are normalized to the prediction for purely Poisson statistics.



4.4 Attempted detection of cosmic magnification

One potential source of genuine cross-correlation between
our data-sets is gravitational lensing ('cosmic magnification')
of background sub-mm sources by dark matter haloes
traced by foreground optical galaxies; we investigate this effect
first. In order to optimize any detection of cosmic magnification,
we used the photometric redshift information to
restrict our comparison to low-redshift (0.2 < z < 0.8) optical
sources and high-redshift (1 < z < 5) sub-mm objects.
We only utilized galaxies with 'high-quality' photometric
redshifts with errors better than $\delta z = 0.4$ (95% confidence
limit, see Mobasher et al. 2004).

The result is plotted in Fig. 4. There is weak evidence
that $w_{\rm cross}(\theta)$ is inconsistent with zero. In fact a small positive
constant value w ~ 0.02 provides a good fit (using the
full covariance matrix). However, we do not interpret this as
evidence for an astrophysically-significant cross-correlation,
given that the expected form of such a correlation is strongly
scale-dependent (see Section 5). In contrast, any residual
systematic problems - such as uncertainties in the underlying
mean source density, the integral constraint correction
for small fields, or unrecognised calibration fluctuations -
would produce a small constant offset in the correlation function
(e.g. Blake & Wall 2002).

Exploring further, we also measured the cross-correlation
function between all SCUBA sources and various
sub-samples of the ACS galaxies. Fig. 5 displays the
results measured in different magnitude bands. Fig. 6 plots
the measurements in redshift bands, only including those
ACS galaxies with high-quality photometric redshifts. In no
case do we find a significant detection of a cross-correlation
(that cannot be fit by a small constant offset). We also detected
no significant cross-correlation by varying the flux
threshold of the sub-mm galaxies. Bright ($S_{850\,\mu{\rm m}} > 10$ mJy)
sub-mm sources are expected to have the steepest flux distribution
and hence the strongest cross-correlation. However,
the number of such sources in our sample is very small.

Figure 4. Cross-correlation function of the SCUBA and ACS
data-sets, measured for samples selected in a manner which
should be efficient for the detection of cosmic magnification. The
measurements contain weak evidence for a small positive constant
offset from zero, but we do not interpret this as an astrophysically-significant
cross-correlation, as explained in the text.


4.5 Attempted detection of galaxy clustering 

The second potential source of genuine cross-correlation between
our data-sets is galaxy clustering, which would arise
if the sub-mm and optical objects traced the same population
of dark matter haloes in the case of overlapping redshift
distributions. In order to efficiently search for clustering
we again used the photometric redshift information,
firstly restricting our data-sets to galaxies with high-quality
photometric redshifts as defined above. We also limited the
input sub-mm and optical data-sets to the redshift range
0.8 < z < 2.0 in order to maximize the overlap of the catalogues.
We then measured the pair counts in angular separation
bins, weighting each galaxy pair by a factor depending
on its redshift difference $\delta z = z_1 - z_2$:

$$\mbox{Weight} = \exp\left[ \, \ldots \, \right] , \qquad (7)$$

where C = 0.1 (determined empirically to optimize the
signal-to-noise ratio of the measurement). These weights
strongly increase the contribution of pairs at similar redshifts,
optimizing any detection of mutual clustering. The
cross-correlation function was then determined using the
usual combination of the (weighted) pair counts. The redshifts
of sources in the random comparison catalogues were
assigned by randomizing the redshifts of the real data. The
error in the correlation function was determined using Monte
Carlo realizations, as before.

Figure 5. Cross-correlation function of the SCUBA and ACS
data-sets, measured in z-filter magnitude intervals of the ACS
sources. The solid line is the cross-correlation result using the
whole optical sample. The separate measurements are offset along
the x-axis for clarity.

Figure 6. Cross-correlation function of the SCUBA and ACS
data-sets, measured in redshift bands of the ACS sources. The
solid line is the cross-correlation result using the whole optical
sample. Only optical galaxies with 'high-quality' photometric redshifts
are included in the analysis. The separate measurements are
offset along the x-axis for clarity.

Figure 7. The solid circles plot the cross-correlation function
of the SCUBA and ACS data-sets, measured in a manner designed
to optimize any detection of galaxy clustering. The open
circles display a measurement of the same statistic for an auto-correlation
function of the ACS galaxies. The lines represent the
best fits of a power-law model $w(\theta) = A\,\theta^{-1}$ to separations in the
range $\theta < 3$ arcmin.
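
The redshift-weighted pair counts used in this section can be sketched as follows. The exact functional form of the weight is not reproduced here: the Gaussian form exp[-(dz/C)^2] below is an assumption adopted only for illustration, as are the function names and the flat-sky (x, y, z) catalogue layout with positions in arcmin.

```python
import math

def pair_weight(z1, z2, c=0.1):
    """Illustrative weight: an ASSUMED Gaussian in the redshift
    difference, down-weighting pairs at dissimilar redshifts."""
    return math.exp(-((z1 - z2) / c) ** 2)

def weighted_pair_counts(cat_a, cat_b, bin_width=0.5, n_bins=10, c=0.1):
    """Sum of pair weights in bins of angular separation.
    Catalogue entries are (x, y, redshift) tuples."""
    counts = [0.0] * n_bins
    for xa, ya, za in cat_a:
        for xb, yb, zb in cat_b:
            k = int(math.hypot(xa - xb, ya - yb) / bin_width)
            if k < n_bins:
                counts[k] += pair_weight(za, zb, c)
    return counts
```

The weighted counts then enter the estimator in place of the unweighted pair counts, with the random catalogues carrying redshifts drawn from the real data.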

The result is plotted as the solid circles in Fig. 7, and 
shows evidence for a positive correlation on small scales. 
Fitting a power-law model $w(\theta) = A\,\theta^{-1}$ to the result allows
us to reject a model with no correlation (A = 0) with a
significance level of 3.5σ. The slope of 1.0 was determined
by fitting a power law to the higher signal-to-noise auto- 
correlation function of the ACS galaxies, as described below, 
and provides a better fit than the canonical slope of 0.8. 
We checked that adding data with lower-quality photometric 
redshifts or changing the value of C in equation 7 did not 
improve the significance of the detection. 
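
A one-parameter amplitude fit of this kind, using the full covariance matrix, can be written as a generalised least-squares problem. The sketch below is illustrative rather than a description of the fitting code used here: the function names are hypothetical, and the significance of a detection is taken to be the ratio of the best-fitting amplitude to its error.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def fit_amplitude(theta, w, cov, slope=1.0):
    """Best-fitting A and its 1-sigma error for w = A * theta^-slope,
    given a symmetric covariance matrix cov."""
    model = [t ** -slope for t in theta]
    cinv_model = solve_linear(cov, model)
    norm = sum(m * c for m, c in zip(model, cinv_model))
    amplitude = sum(d * c for d, c in zip(w, cinv_model)) / norm
    return amplitude, norm ** -0.5
```

Only the bins within the fitted range of separations would be passed to `fit_amplitude`, together with the corresponding sub-block of the Monte Carlo covariance matrix.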

For comparison, we repeated the calculation analyz- 
ing the clustering of the ACS galaxies alone via an auto- 
correlation function, weighting galaxy pairs as before. The 
result, plotted as the open circles in Fig. 7, can be estab- 
lished with greater significance, owing to the larger optical 
sample. The auto-correlation amplitude appears to be lower 
than the cross-correlation amplitude on the smallest scales 
(the significance of the offset is 2σ). A tentative interpretation
of this finding would be the expected higher clustering
bias of sub-mm galaxies. This is consistent with suggestions 
of clustering of sub-mm galaxies in redshift-space (Blain et 
al. 2004). We note that differences in the redshift distri- 
butions of the sub-mm and optical sources will affect this 
comparison: the ACS catalogue contains more objects at 
the lower end of the redshift range analyzed. However, for 
these low-redshift pairs a given angular scale corresponds 
to a smaller physical scale, which will boost the correlation 
amplitude. Therefore, the 2σ significance of the discrepancy
between the cross-correlation and auto-correlation functions 
in Fig. 7 is conservative. 

There are several ways in which one might try to refine 
this procedure, including: changing the weighting scheme for 
SCUBA galaxies which have spectroscopic redshifts; adapt- 
ing the weight of each pair depending on the quality of the 
photometric redshift(s); using cuts on colour as well as magnitude;
etc. We have not performed an exhaustive investigation
of these issues, since what is really required to achieve
definitive measurements is larger sub-mm surveys, as dis- 
cussed in Section 6. 



5 CROSS-CORRELATION MODELLING 

5.1 Cosmic magnification prediction 

In Section 4.4 we failed to detect any evidence for cross- 
correlation between our sub-mm and optical data-sets of the 
form which might be induced by cosmic magnification. We 
now compare this result to theoretical predictions for the 
size of this effect. 

The magnitude of the cross-correlation amplitude induced
by cosmic magnification is determined in part by the
slope of the differential number-counts for the background
population: $w_{\rm cross} \propto (\beta - 1)$, where the number of sources
with fluxes between S and S + dS is given by $S^{-\beta}\,dS$. The
number-counts slope for bright sub-mm sources is known to
be exceptionally steep, $\beta \sim 3$, increasing the effect of cosmic
magnification relative to populations with shallower
counts (such as quasars, where the maximum slope $\beta \sim 2$ is
only reached for the very brightest sources, Scranton et al.
2005). For fainter sub-mm galaxies the effective value of $\beta$
becomes smaller, in accordance with equation 6.

The cross-correlation function due to cosmic magnification
can be predicted using simple models (e.g. Moessner &
Jain 1998). If we assume that the source and lens populations
are at fixed redshifts $z_s$ and $z_l$, then

$$ w_{sl}(\theta) = \frac{3\,\Omega_m\,(\beta - 1)\,b_l\,g_{ls}}{a(z_l)} \, \frac{1}{2\pi L_H^2} \int k\,P(k, z_l)\,J_0\big(k\,x(z_l)\,\theta\big)\,dk, \qquad (8) $$

where $\Omega_m$ is the present-day matter density relative to the
critical density, $L_H = c/H_0$ is the Hubble length in units
of $h^{-1}$ Mpc, $b_l$ is the linear biassing factor for the lenses,
$g_{ls} = x(z_l)\,[x(z_s) - x(z_l)]/x(z_s)$ is the geometrical factor for
lensing, $x(z)$ is the co-moving angular diameter distance for
a flat Universe, $a(z)$ is the usual cosmological scale factor,
$P(k, z)$ is the matter power spectrum at redshift $z$ (including
non-linear clustering), and $J_0$ is the zeroth-order Bessel
function. In equation 8 the units of $x(z)$, $k$ and $P(k, z)$ are
$h^{-1}$ Mpc, $h$ Mpc$^{-1}$ and $h^{-3}$ Mpc$^{3}$, respectively.

We considered a foreground lens population with $b_l = 1$
at $z_l = 0.5$ and a background source population at $z_s = 2$.
We assumed a spatially-flat cosmology with cosmological
parameters $\Omega_m = 0.3$ and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, and
generated a non-linear power spectrum using the prescription of
Peacock & Dodds (1994), using a linear power spectrum
produced from the fitting formulae of Eisenstein & Hu (1998),
with baryon fraction $\Omega_b/\Omega_m = 0.15$, spectral index $n_s = 1$
and zero-redshift normalization $\sigma_8 = 0.8$ (scaled to redshift
$z_l$ using the linear growth factor of Carroll, Press & Turner
1992). In Fig. 8 we plot the resulting cross-correlation
function of equation 8 for number-count slopes between $\beta = 2.0$
and $\beta = 3.0$. Realistically, the redshift distributions of the
background and foreground populations will be broader than
the infinitesimal shells we have assumed; this will lower the
amplitude of the cross-correlation, owing to the geometrical
factor $g_{ls}$ in equation 8.
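As a rough numerical check on the geometry entering equation 8, the following Python sketch evaluates the co-moving distance $x(z)$ and the geometrical factor $g_{ls}$ for a flat Universe with $\Omega_m = 0.3$. It is an illustration only and not the code used for Fig. 8: the function names and the choice of Simpson's rule are assumptions made here, and the power-spectrum integral of equation 8 is not attempted.

```python
import math

# c / H0 in h^-1 Mpc (speed of light in km/s divided by 100 km/s/Mpc).
HUBBLE_LENGTH = 2997.92458


def comoving_distance(z, omega_m=0.3, steps=2000):
    """Co-moving distance x(z) in h^-1 Mpc for a flat matter + Lambda model.

    Hypothetical helper: integrates 1/E(z) with composite Simpson's rule,
    so `steps` must be even.
    """
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return HUBBLE_LENGTH * total * h / 3.0


def lensing_geometry(z_lens, z_source, omega_m=0.3):
    """Geometrical factor g_ls = x_l (x_s - x_l) / x_s in h^-1 Mpc."""
    x_l = comoving_distance(z_lens, omega_m)
    x_s = comoving_distance(z_source, omega_m)
    return x_l * (x_s - x_l) / x_s


if __name__ == "__main__":
    print("x(0.5) =", comoving_distance(0.5), "h^-1 Mpc")
    print("x(2.0) =", comoving_distance(2.0), "h^-1 Mpc")
    print("g_ls   =", lensing_geometry(0.5, 2.0), "h^-1 Mpc")
```

Under these assumptions the lens at $z_l = 0.5$ and source at $z_s = 2$ give a geometrical factor of order 800 $h^{-1}$ Mpc; broadening either redshift distribution averages this factor over less favourable configurations.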

As can be seen, the predictions of Fig. 8 provide an entirely
acceptable fit to the null measurement of cosmic magnification
plotted in Fig. 4 (which is reproduced in Fig. 8 for
comparison). However, our results appear to disagree somewhat
with those of Almaini et al. (2005, Figures 1 and 2),
who observed a significantly higher cross-correlation amplitude
between sub-mm and optically-selected galaxies, with
their favoured explanation being lensing magnification. The
signal-to-noise of the measurements is low, but we find it
difficult to reconcile these results, corresponding to $w_{sl} \approx 0.2$
on small scales $\theta \approx 30''$, with our lensing model, unless the
fields studied happen to contain a highly disproportionate
concentration of massive lenses.

Figure 8. Model cross-correlation function for cosmic magnification
of a background source population at $z_s = 2$ by a foreground
lens population at $z_l = 0.5$, plotted against $\theta$ / arcmin.
Different curves correspond to different slopes $\beta$ of the
number-counts relation for the background population
($\beta = 2.0$, 2.5 and 3.0). The plotted data points are a
reproduction of our measurement from Fig. 4.

As an alternative and cruder model, we can estimate
the quantity of lenses required to generate a given
cross-correlation signal by assuming a simple distribution of
singular isothermal spheres. Denoting the lensing magnification
factor as $\mu$, the enhancement in surface density of
background sources is given by $\mu^{\beta-1}$, where $\beta$ is the slope of the
differential number counts distribution (as defined above).
For a background source at (lensed) angular separation $\theta$
from an isothermal lens with Einstein radius $\theta_E$,

$$ \mu = \frac{\theta}{\theta - \theta_E} \qquad (9) $$

(Bartelmann & Schneider 2001, equation 3.19). Assuming
$\beta = 3$ (which is appropriate for the brightest SCUBA
galaxies), a correlation function $w(0.5') \approx 0.2$ is generated
if sources have an average magnification such that
$w \approx \mu^{\beta-1} - 1$, or $\mu \approx 1.1$. Using equation 9, the average
source must be at angular separation $\theta \approx 0.5$ arcmin from a
lens with Einstein radius $\theta_E \approx 2.6''$. The Einstein radius of
an isothermal sphere with 1D velocity dispersion $\sigma_v$ is given
by

$$ \theta_E = 4\pi \left(\frac{\sigma_v}{c}\right)^2 \frac{x(z_s) - x(z_l)}{x(z_s)} \qquad (10) $$

(Bartelmann & Schneider 2001, equation 3.17). Substituting
the approximate value 0.5 for the geometrical factor, we
obtain $\theta_E \approx 0.6''\,(\sigma_v / 200\ {\rm km\,s^{-1}})^2$. Hence in this simple
model, lenses of velocity dispersion $\sigma_v \approx 420$ km s$^{-1}$ (i.e.,
galaxy groups) must be responsible for the lensing
magnification. Given that our background sources must be located
on average 0.5 arcmin from these lenses, we estimate that a
lens surface density $\sim 1$ arcmin$^{-2}$ is required. This is
consistent with the density of optical sources at the relevant
magnitude limit, but significantly exceeds the expected density
of groups (i.e., $\sim 0.02$ arcmin$^{-2}$, e.g. Yan et al. 2004). We
conclude that the fields observed by Almaini et al. (2005)
would have to be unusual areas of sky in order to generate
an angular correlation function $w(0.5') \approx 0.2$ by lensing
magnification. We note that if $w(0.5') = 0.04$ (Fig. 8), then
the equivalent lens velocity dispersion is $\sigma_v \approx 200$ km s$^{-1}$,
more typical of individual galaxies with a surface density
$\sim 1$ arcmin$^{-2}$ in the appropriate redshift range.
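The chain of estimates in the isothermal-sphere argument can be reproduced with a few lines of Python. This is a minimal sketch under the same simplifications as the text (a single representative lens-source separation and a geometrical factor fixed at 0.5); the function names are hypothetical and introduced only for illustration.

```python
import math

C_KM_S = 299792.458                         # speed of light in km/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi   # arcsec in one radian


def magnification_for_signal(w, beta):
    """Invert w = mu**(beta - 1) - 1 for the mean magnification mu."""
    return (1.0 + w) ** (1.0 / (beta - 1.0))


def einstein_radius_for_magnification(mu, theta_arcsec):
    """Invert equation 9, mu = theta / (theta - theta_E), for theta_E."""
    return theta_arcsec * (1.0 - 1.0 / mu)


def sis_einstein_radius(sigma_v, geometry=0.5):
    """Equation 10 in arcsec; sigma_v in km/s, geometry = (x_s - x_l)/x_s."""
    return 4.0 * math.pi * (sigma_v / C_KM_S) ** 2 * geometry * ARCSEC_PER_RAD


def velocity_dispersion_for_radius(theta_e_arcsec, geometry=0.5):
    """Invert equation 10 for sigma_v in km/s."""
    return C_KM_S * math.sqrt(
        theta_e_arcsec / (4.0 * math.pi * geometry * ARCSEC_PER_RAD)
    )


if __name__ == "__main__":
    for w in (0.2, 0.04):
        mu = magnification_for_signal(w, beta=3.0)
        theta_e = einstein_radius_for_magnification(mu, theta_arcsec=30.0)
        sigma_v = velocity_dispersion_for_radius(theta_e)
        print(f"w={w}: mu={mu:.3f}, theta_E={theta_e:.2f} arcsec, "
              f"sigma_v={sigma_v:.0f} km/s")
```

For $w = 0.2$ the sketch returns an Einstein radius close to $2.6''$ and a velocity dispersion a little above 400 km s$^{-1}$; for $w = 0.04$ it returns a value near 200 km s$^{-1}$. Small differences from the quoted figures arise because the text rounds the Einstein radius of a 200 km s$^{-1}$ lens to $0.6''$.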

As suggested by Almaini et al., an alternative explana- 
tion for their observed cross-correlations is that a higher- 
than-expected fraction of the sub-mm galaxies studied were 
located at relatively low redshifts, $z < 1$, and that the signal
was generated by galaxy clustering rather than by lensing 
magnification, as discussed in Section 5.2 below. However, 
only ~ 10% of our sample of sub-mm sources in GOODS-N 
lies at z < 1, and we note that, in surveys with large redshift 
depth, it is difficult to generate any significant angular cross- 
correlation amplitude from galaxy clustering without pho- 
tometric redshift information to restrict the redshift slices 
compared. 

5.2 Galaxy clustering prediction 

We now estimate the amplitude of angular cross-correlation
resulting from the mutual clustering of sub-mm and
optically-selected galaxies with similar photometric redshifts.
Both populations trace the underlying distribution of
dark matter, which (for the purposes of this calculation) we
will assume is clustered in accordance with a power-law
spatial auto-correlation function, $\xi(r) = (r/r_0)^{-\gamma}$, where $r$ is a
co-moving separation, $r_0$ is the co-moving 'clustering length'
at the redshift in question, and the slope $\gamma = 1.8$. We will
further assume that the two galaxy populations are linearly
biased with respect to the dark matter fluctuations,
possessing clustering lengths $r_1$ and $r_2$. In this case, using the
definition of bias, the spatial cross-correlation function $\xi_{\rm cross}$
is derived by replacing $r_0$ with $(r_1 r_2)^{1/2}$ in the formula for
$\xi(r)$. The spatial cross-correlation function, $\xi_{\rm cross}(r)$, must
then be projected onto an angular cross-correlation function,
$w_{\rm cross}(\theta)$.
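The replacement of the clustering length by a geometric mean follows because, for linear bias, the cross-correlation of two populations equals the geometric mean of their auto-correlations. The short Python sketch below checks this numerically for power-law correlation functions; the function names are hypothetical and the clustering lengths are example inputs.

```python
import math


def xi_power_law(r, r0, gamma=1.8):
    """Power-law auto-correlation function xi(r) = (r / r0)**(-gamma)."""
    return (r / r0) ** (-gamma)


def xi_cross(r, r1, r2, gamma=1.8):
    """Cross-correlation obtained by using sqrt(r1 * r2) as the clustering length."""
    return xi_power_law(r, math.sqrt(r1 * r2), gamma)


if __name__ == "__main__":
    r, r1, r2 = 3.0, 5.0, 7.0  # all in h^-1 Mpc
    geometric_mean = math.sqrt(xi_power_law(r, r1) * xi_power_law(r, r2))
    print(xi_cross(r, r1, r2), geometric_mean)  # the two values agree
```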

We consider the two cases of sub-mm galaxies with
spectroscopic redshifts and with only photometric redshifts.
These are compared with an optical photo-$z$ catalogue. In
the first case, the contribution from each sub-mm galaxy
(at redshift $z = z_0$) to the angular cross-correlation at angle
$\theta$ can be determined by integrating $\xi_{\rm cross}$ along the
line-of-sight at angular separation $\theta$, weighted by the redshift
probability distribution $p(z)$ of the optically-selected sources
(normalized such that $\int p(z)\,dz = 1$), i.e.

$$ w_{\rm cross} = \int p(z)\,\xi_{\rm cross}(\theta, z)\,dz \approx (r_1 r_2)^{\gamma/2} \int p(z)\,\left[(x_0\theta)^2 + (x - x_0)^2\right]^{-\gamma/2} dz, \qquad (11) $$

where $x(z)$ is the co-moving radial co-ordinate and $x_0 = x(z_0)$.
The redshift distribution $p(z)$ is the error distribution
of the photometric redshifts for optical galaxies with
best-fitting redshifts near $z = z_0$. We will assume that this
function is a Gaussian distribution with mean $z_0$ and
standard deviation $\sigma_z$. As the width of this function is much
larger than the clustering length, a good approximation for
equation 11 is

$$ w_{\rm cross} = C_\gamma\,(r_1 r_2)^{\gamma/2}\,p(z_0)\,\left(\frac{dx}{dz}(z_0)\right)^{-1} x(z_0)^{1-\gamma}\,\theta^{1-\gamma}, \qquad (12) $$

where $C_\gamma = \Gamma(\frac{1}{2})\,\Gamma(\frac{\gamma}{2} - \frac{1}{2})/\Gamma(\frac{\gamma}{2})$.

For the case where the sub-mm source redshift is only 
photometric, in order to obtain the resulting angular cross- 
correlation we must integrate equation 12 over redshift 
again, weighting by a further factor of p(z). Hence we as- 
sume (for the purposes of this calculation) that the photo- 
metric redshift error distribution for the sub-mm galaxies is 
the same as for the optical sources. The result is an extra 
damping of the cross-correlation amplitude: 

$$ w_{\rm cross} = C_\gamma\,(r_1 r_2)^{\gamma/2}\,\theta^{1-\gamma} \int p(z)^2 \left(\frac{dx}{dz}\right)^{-1} x(z)^{1-\gamma}\,dz. \qquad (13) $$

Equation 13 is in fact the usual Limber equation for the 
projection of spatial clustering (e.g. Peebles 1980). 

In order to evaluate these expressions, we take a
photometric-redshift error $\sigma_z = 0.2$, which is characteristic
of our data (i.e. $\sigma_z/(1 + z) \approx 0.1$ as noted in Section
2). We assume that the co-moving clustering length of the
optically-selected galaxies is constant with redshift (Lahav
et al. 2002), $r_1 = 5\,h^{-1}$ Mpc, and take an enhanced clustering
amplitude for the sub-mm galaxies, $r_2 = 7\,h^{-1}$ Mpc
(Blain et al. 2004). The predictions of equations 12 and
13 for the cases of spectroscopic and photometric redshifts
for the sub-mm galaxies are displayed in Fig. 9 for $z_0 = 1$
and $z_0 = 2$. Finally, we make the approximation that each
sub-mm galaxy is an independent probe of the clustering, 
thus the angular correlation functions determined for each 
sub-mm source (which may have a spread as indicated in 
Fig. 9) can simply be averaged. Since approximately half of 
the sub-mm galaxies have spectroscopic redshifts, the final 
result should fall somewhere in between the curves shown 
in Fig. 9. This model is a reasonable fit to our observations, 
which are also reproduced in Fig. 9 for comparison. 
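The amplitudes implied by equations 12 and 13 can be evaluated directly for the parameter values quoted above. The Python sketch below is an illustrative re-implementation and not the code used for Fig. 9: it assumes a flat Universe with $\Omega_m = 0.3$, so that $dx/dz = L_H/E(z)$, and uses simple midpoint integration. All function names are hypothetical.

```python
import math

HUBBLE_LENGTH = 2997.92458          # c / H0 in h^-1 Mpc
ARCMIN = math.pi / (180.0 * 60.0)   # radians per arcmin


def e_of_z(z, omega_m=0.3):
    """Dimensionless Hubble rate E(z) for a flat matter + Lambda model."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def comoving_distance(z, steps=200):
    """x(z) in h^-1 Mpc via the midpoint rule."""
    dz = z / steps
    return HUBBLE_LENGTH * dz * sum(1.0 / e_of_z((i + 0.5) * dz) for i in range(steps))


def gaussian(z, z0, sigma_z):
    """Assumed photometric-redshift error distribution p(z)."""
    norm = math.sqrt(2.0 * math.pi) * sigma_z
    return math.exp(-0.5 * ((z - z0) / sigma_z) ** 2) / norm


def c_gamma(gamma):
    return math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)


def kernel(z, gamma):
    """(dx/dz)^-1 * x(z)^(1 - gamma), with dx/dz = L_H / E(z)."""
    return e_of_z(z) / HUBBLE_LENGTH * comoving_distance(z) ** (1.0 - gamma)


def amplitude(theta_arcmin, r1, r2, gamma):
    theta = theta_arcmin * ARCMIN
    return c_gamma(gamma) * (r1 * r2) ** (0.5 * gamma) * theta ** (1.0 - gamma)


def w_cross_spec(theta_arcmin, z0, r1=5.0, r2=7.0, gamma=1.8, sigma_z=0.2):
    """Equation 12: sub-mm redshift known spectroscopically."""
    return (amplitude(theta_arcmin, r1, r2, gamma)
            * gaussian(z0, z0, sigma_z) * kernel(z0, gamma))


def w_cross_phot(theta_arcmin, z0, r1=5.0, r2=7.0, gamma=1.8, sigma_z=0.2, steps=400):
    """Equation 13: both redshifts photometric, same Gaussian error assumed."""
    z_lo = max(z0 - 5.0 * sigma_z, 0.0)
    dz = (z0 + 5.0 * sigma_z - z_lo) / steps
    integral = 0.0
    for i in range(steps):
        z = z_lo + (i + 0.5) * dz  # midpoints avoid x(z) = 0 at z = 0
        integral += gaussian(z, z0, sigma_z) ** 2 * kernel(z, gamma) * dz
    return amplitude(theta_arcmin, r1, r2, gamma) * integral


if __name__ == "__main__":
    for z0 in (1.0, 2.0):
        print(z0, w_cross_spec(1.0, z0), w_cross_phot(1.0, z0))
```

With these inputs the spectroscopic case gives an amplitude in the range 0.1 to 0.2 at $\theta = 1$ arcmin, and the photometric case is lower by a factor close to $\sqrt{2}$, as expected when a Gaussian is integrated against itself.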



6 PROSPECTS FOR FUTURE SURVEYS 

In order to measure the cosmic magnification pattern with
reasonable accuracy (say, signal-to-noise exceeding 3 in
several separation bins) we require cross-correlation function
measurements with precision $\delta w \approx 0.002$ in bins of width
$\delta\theta \approx 0.5$ arcmin (see Fig. 8). The error in the correlation
function is, roughly speaking, determined by the number of
galaxy pairs measured: $\delta w \approx N_{\rm pairs}^{-1/2}$ (with the caveats
discussed in Section 3). If $\Sigma_{\rm opt}$ is the surface density of optical
galaxies in the appropriate redshift range, then each
sub-mm source participates in an average of $\Sigma_{\rm opt} \times 2\pi\theta\,\delta\theta$ pairs
in a bin of average separation $\theta$.

For the current cosmic magnification analysis (Fig. 4),
we have a surface density of optical sources $\Sigma_{\rm opt} \approx 10$
arcmin$^{-2}$ in the appropriate redshift range ($0.2 < z < 0.8$).
Since $N_{\text{sub-mm}} \approx 30$, we recover $\delta w \approx 0.03 \times [\theta({\rm arcmin})]^{-1/2}$
in bins of width $\delta\theta = 0.5$ arcmin, as observed in Fig. 4.
In order to achieve a measurement of cosmic magnification
with reasonable significance we therefore require a sub-mm
survey with $\sim 100$ times as many sources. This should easily
be achieved with surveys being planned with the new
SCUBA-2 instrument (Holland et al. 2003). However, it is
worth bearing in mind that a high density of optical galaxies
and reasonably good photo-$z$ estimates will be required.

Figure 9. Model cross-correlation function generated by the
mutual clustering of sub-mm and optically-selected galaxies. We
compare cases where the sub-mm galaxy redshift $z_0$ is known with
spectroscopic and photometric accuracy, for $z_0 = 1$ and $z_0 = 2$.
We assume that the optical galaxies always have a photometric
redshift distribution centred at $z_0$ with an r.m.s. width $\sigma_z = 0.2$.
The plotted data points are a reproduction of our measurement
from Fig. 7.

The cross-correlation resulting from the mutual clus- 
tering of sub-mm and optically-selected galaxies should be 
easier to detect, owing to its larger amplitude (Fig. 9). 
Let us assume optical data of equivalent quality to that 
used in this study. The surface density of optical galaxies 
with 'high-quality' photometric redshifts in a redshift range 
overlapping with the sub-mm sources is again $\Sigma_{\rm opt} \approx 10$
arcmin$^{-2}$, which we divide into 3 independent photo-$z$ bins.
Since $N_{\text{sub-mm}} \approx 15$ in the region of overlap, we recover
$\delta w \approx 0.08 \times [\theta({\rm arcmin})]^{-1/2}$ in bins of width $\delta\theta = 0.5$ arcmin.
However, the expected amplitude of the cross-correlation
due to clustering (Fig. 9) is significantly larger than that
due to cosmic magnification (Fig. 8): $w_{\rm cross} \approx 0.1$ to $0.2$ at
$\theta = 1$ arcmin. We therefore conclude that a sub-mm survey
with ~ 10 times as many sources as used in the current study 
should suffice to measure this signal with reasonable accu- 
racy, assuming an appropriate quantity of optical follow-up. 
This should be achievable for the on-going SHADES project 
(Mortier et al. 2005). Alternatively, we note that the clus- 
tering amplitude could be accurately measured using the 
current catalogues if spectroscopic redshifts were available 
for both the optical and sub-mm sources. Figure 10 illus- 
trates the accuracy of measurement achievable for each type 
of analysis through the comparison of sub-mm and optical 
catalogues. 
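The error estimates quoted in this section follow from simple pair counting. As a minimal sketch (assuming pure Poisson errors, with a hypothetical function name), the expected uncertainty per separation bin can be computed as:

```python
import math


def pair_count_error(n_submm, sigma_opt, theta, dtheta):
    """Poisson error on w(theta) from the expected number of pairs.

    n_submm   : number of sub-mm sources
    sigma_opt : optical surface density in arcmin^-2
    theta     : mean bin separation in arcmin
    dtheta    : bin width in arcmin
    """
    n_pairs = n_submm * sigma_opt * 2.0 * math.pi * theta * dtheta
    return 1.0 / math.sqrt(n_pairs)


if __name__ == "__main__":
    # Cosmic magnification analysis: 30 sources against 10 arcmin^-2.
    print(pair_count_error(30, 10.0, theta=1.0, dtheta=0.5))
    # Clustering analysis: 15 sources, optical sample split into 3 photo-z bins.
    print(pair_count_error(15, 10.0 / 3.0, theta=1.0, dtheta=0.5))
```

At $\theta = 1$ arcmin these evaluate to roughly 0.03 and 0.08 respectively, matching the coefficients quoted in the text. Because the error scales as the inverse square root of the number of sub-mm sources, reducing it by a factor of ten requires about a hundred times as many sources at fixed optical density.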



Figure 10. A rough indication of the accuracy with which
sub-mm surveys yielding $N_{\text{sub-mm}}$ sources can measure cosmic
magnification and galaxy clustering through comparison with deep
optical catalogues with surface density $\Sigma_{\rm opt}$. The 5-$\sigma$ threshold
for each type of analysis is defined using estimates for the
cross-correlation function amplitude and error within an assumed range
of measured scales. The amplitudes are obtained from the
correlation function models developed in Section 5. The errors are
derived from the number of object pairs within the separation bin,
assuming a Poisson error $\delta w = N_{\rm pairs}^{-1/2}$. For the cosmic
magnification analysis, we assume $w(\theta) = 0.02$ at a scale $\theta = 1$ arcmin
(Fig. 8) and consider a bin of width $\delta\theta = 0.5$ arcmin. For the
photo-$z$ clustering measurement we use the same separation bin,
but assume $w(1\ {\rm arcmin}) = 0.1$ (Fig. 9) and divide the catalogues
into three independent photo-$z$ bins. For the estimate where all
objects have spectroscopic redshifts, we spread the optical sources
over a redshift range $1 < z < 2$ and consider the measurement of
a spatial correlation function $\xi = 1$ at a scale $5\,h^{-1}$ Mpc in a bin
of width $1\,h^{-1}$ Mpc.



7 SUMMARY

We have investigated the cross-correlation between sub-mm
and optical sources in the GOODS-N survey region, using
photometric redshift information. We find that:

• Comparing high-redshift (z > 1) SCUBA sources with 
low-redshift (0.2 < z < 0.8) optical galaxies, we can de- 
tect no evidence for cross-correlation due to cosmic mag- 
nification. We attribute previous reported detections to ei- 
ther: (1) a difference in the correlation function estimator 
employed; (2) analysis of a field that happens to contain a 
highly dis-proportionate concentration of massive lenses; or 
(3) a higher-than-expected fraction of sub-mm galaxies re- 
siding at relatively low redshifts z < 1. Based on calculations 
of the expected amplitude of the lensing magnification sig- 
nal in the standard cosmology, the sub-mm data-set must 
be increased in size by a factor ~ 100 to secure a significant 
measurement. 

• Comparing optical and sub-mm sources in identical 
photometric redshift slices, we detect evidence for a cross- 
correlation due to galaxy clustering (with a significance level 
of 3.5-$\sigma$). The sub-mm sources appear to possess a higher
bias factor than the optical galaxies (with a significance of
2-$\sigma$). This observation, if confirmed by larger surveys, would
support the hypothesis that sub-mm sources form in rela- 
tively dense environments in the high-redshift Universe. 

One of the primary goals of the SHADES project 
(Mortier et al. 2005) is to measure the clustering properties 
of sub-mm galaxies via an auto-correlation function analysis. 
We note that the cross-correlation with optically-selected 
galaxies could also provide valuable information (owing to 
the higher surface density of optical sources), provided that
the optical data are of sufficient quality. Cross-correlation
in different redshift slices, in order to measure the lensing 
magnification, is an independent effect which should add an 
extra structure formation diagnostic to future sub-mm sur- 
veys, such as those that will be carried out with SCUBA- 
2. In terms of structure formation models, the auto- and 
cross-correlation functions have a different dependence on 
the halo occupation distribution, as well as on redshift and 
other source properties. Hence, an investigation of cross- 
correlation in future ambitious sub-mm surveys holds the 
promise of unravelling details of galaxy formation and bias 
within massive haloes.